Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add container meta info on module level #2682

Merged
merged 4 commits into from
Oct 5, 2016

Conversation

ruflin
Copy link
Member

@ruflin ruflin commented Oct 5, 2016

Most of the docker module metric sets have kind of meta information about the container which is identical across the MetricSets. Until now this information was nested inside each MetricSet which made it very hard to query for all information related to one container. To make querying possible this information was moved to the module level.

Each MetricSet can add information to the module level by adding information to the event with the _module key. This convention is based on the assumption that this key is never used for any other field. During processing of the event this information is moved up to the module level.

It must be made sure that the mapping for this data is added to the fields.yml on the module level. In case of the docker module this is not necessary as the container MetricSet already has the mappings for these values.

An example event looks as following. This makes querying based on docker.container.id simple.

{
    "@timestamp": "2016-05-23T08:05:34.853Z",
    "docker": {
        "container": {
            "id": "2d82c01e5e8a975e8a2912f5dcd7a03ec1505f2c7d2fcf54961582d8e9e34509",
            "name": "admiring_ptolemy",
            "socket": "unix:///var/run/docker.sock"
        },
        "cpu": {
            "usage": {
                "kernel_mode": 0,
                "per_cpu": {
                    "0": 0,
                    "1": 0
                },
                "total": 0,
                "user_mode": 0
            }
        }
    },
    "metricset": {
        "host": "localhost",
        "module": "docker",
        "name": "cpu",
        "rtt": 115
    },
    "type": "metricsets"
}

Additional Changes:

  • Set default time period to 10s
  • Make event generation in test based on events builder to make sure events are the same as sent
  • Remove container info from fields.yml files as not needed anymore. Update mappings.

@ruflin ruflin added review Metricbeat Metricbeat labels Oct 5, 2016
@ruflin ruflin force-pushed the mb-container-meta branch from 702f3da to fe1a5d1 Compare October 5, 2016 07:57
@tsg
Copy link
Contributor

tsg commented Oct 5, 2016

This is great 👍

@tsg
Copy link
Contributor

tsg commented Oct 5, 2016

Implementation wise, we might at some point be more explicit about it, like returning two dicts from the Fetch methods. But I think this is fine for now.

@ruflin
Copy link
Member Author

ruflin commented Oct 5, 2016

@tsg This is currently fully based on a convention to not break existing MetricSets. We could return two objects or follow the approach from @urso in outputs with the Data struct: https://github.com/elastic/beats/blob/master/libbeat/outputs/outputs.go#L19

I would also like to see that the logic for the enrichment would be on the module level. Means we would have for example in the Docker module a method AddMetaInfo that has the logic inside to add this meta data based on the values provided by the MetricSets. This could allow more complex cases and not every MetricSet would have to add it to the event itself. This is very similar to the processors we discussed. Perhaps a module is also a Processor? ... Just some thoughts.

@ruflin ruflin force-pushed the mb-container-meta branch 2 times, most recently from 5ef4d83 to 81fdc2c Compare October 5, 2016 08:42
},
}

// In case meta data exists, it is added on the module level
if ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of the ok variable name is a bit confusing here, because it's set more than a few lines before. Perhaps metadataExists, or something like that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed it to moduleData and moduleDataExists to make it sure it does not get confused with the global meta data.

@ruflin ruflin force-pushed the mb-container-meta branch from 81fdc2c to f425c8d Compare October 5, 2016 11:58
@ruflin
Copy link
Member Author

ruflin commented Oct 5, 2016

@tsg I changed the convention to _module as the other one already conflicted in the system module. _module is also more obvious when modifying the metricset. I pushed a new version and also updated the PR docs to not forget about it.

@@ -15,31 +15,31 @@ const (
// eventBuilder is used for building MetricSet events. MetricSets generate a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/eventBuilder/EventBuilder/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Host string
StartTime time.Time
FetchDuration time.Duration
Event common.MapStr
fetchErr error
filters *processors.Processors
metadata common.EventMetadata
}

// build builds an event from MetricSet data and applies the Module-level
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/build/Build/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Host string
StartTime time.Time
FetchDuration time.Duration
Event common.MapStr
fetchErr error
filters *processors.Processors
metadata common.EventMetadata
}

// build builds an event from MetricSet data and applies the Module-level
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider diff'ing a before and after golint ./... | grep -v vendor output to look for new warnings.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh, haven't checked golint for quite some time :-( Looks like Metricbeat needs some cleanup, especially the docker module. Will do that in an other PR (after fixing the parts of this PR).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a TODO on the list here for this one: #2629

"type": typeName,
// Checks if additional meta information is provided by the MetricSet under the key _module
// This is based on the convention that each MetricSet can provide module data under the key _module
moduleData, moudleDataExists := event["_module"]
Copy link
Member

@andrewkroh andrewkroh Oct 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider making _module an exported constant. It might need to be in the mb package in order for MetricSet implementation to use it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I introduce mb.MODULE_DATA. This will simplify changes in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MODULE_DATA isn't a conventional Go name. Even constants should be mixed case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed to ModuleData

},
}

// In case meta data exists, it is added on the module level
if moudleDataExists {
event[b.ModuleName].(common.MapStr).Update(moduleData.(common.MapStr))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should test if it's a common.MapStr otherwise this could cause a panic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a validation, but as the common.MapStr is assinged just a few lines before I think it could also be assume that this should never panic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was referring to testing that moduleData is a common.MapStr. I didn't see where that was validated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got it. Will check that one instead ;-)

@ruflin ruflin force-pushed the mb-container-meta branch 2 times, most recently from d88a365 to d8431ac Compare October 5, 2016 12:21
@tsg
Copy link
Contributor

tsg commented Oct 5, 2016

I like the _module convention more.

@ruflin ruflin force-pushed the mb-container-meta branch from d8431ac to bee093b Compare October 5, 2016 13:10
ruflin added 3 commits October 5, 2016 16:41
Most of the docker module metric sets have kind of meta information about the container which is identical across the MetricSets. Until now this information was nested inside each MetricSet which made it very hard to query for all information related to one container. To make querying possible this information was moved to the module level.

Each MetricSet can add information to the module level by adding information to the event with the `_module` key. This convention is based on the assumption that this key is never used for any other field. During processing of the event this information is moved up to the module level.

It must be made sure that the mapping for this data is added to the fields.yml on the module level. In case of the docker module this is not necessary as the container MetricSet already has the mappings for these values.

An example event looks as following. This makes querying based on `docker.container.id` simple.

```
{
    "@timestamp": "2016-05-23T08:05:34.853Z",
    "docker": {
        "container": {
            "id": "2d82c01e5e8a975e8a2912f5dcd7a03ec1505f2c7d2fcf54961582d8e9e34509",
            "name": "admiring_ptolemy",
            "socket": "unix:///var/run/docker.sock"
        },
        "cpu": {
            "usage": {
                "kernel_mode": 0,
                "per_cpu": {
                    "0": 0,
                    "1": 0
                },
                "total": 0,
                "user_mode": 0
            }
        }
    },
    "metricset": {
        "host": "localhost",
        "module": "docker",
        "name": "cpu",
        "rtt": 115
    },
    "type": "metricsets"
}
```

Additional Changes:

* Set default time period to 10s
* Make event generation in test based on events builder to make sure events are the same as sent
* Remove container info from fields.yml files as not needed anymore. Update mappings.
* Fix . mapping in memory.rss to be compatible with elasticsearch < 2.4
@ruflin ruflin force-pushed the mb-container-meta branch from 4e4ee08 to a5f1d73 Compare October 5, 2016 14:42
@andrewkroh andrewkroh merged commit cc989a4 into elastic:master Oct 5, 2016
@monicasarbu monicasarbu deleted the mb-container-meta branch October 5, 2016 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants